We introduce a stochastic model of growing networks in which both the number of new nodes
joining the network and the number of connections they make vary stochastically. We provide
an exact mapping between this model and the zero range process, and use this mapping to
derive an analytical solution for the degree distribution under any given evolution rule.
The mapping can also be used to infer a possible evolution rule for a given network. We
demonstrate this for the protein-protein interaction (PPI) network of Saccharomyces
cerevisiae.

The study of networks has been gaining recognition as a fundamental tool for understanding
the dynamical behavior and response of real systems from different fields such as biology,
social systems, and technological systems [1, 2, 3]. Different network models have been
proposed to study and understand systems with an underlying network structure. The random
network model of Erdős and Rényi was one of the earliest; it shows that the probability
p(k) of a node having degree k follows an exponential distribution, $p(k) \propto \exp(-k)$
[6]. Many real-world networks, however, show scale-free behavior, $p(k) \propto k^{-\gamma}$,
the most striking examples being the World Wide Web and cellular networks (for a review of
scale-free networks see [2]). In the WWW, the number of incoming links follows a power law
with $\gamma \sim 1.94$, and an analysis of the metabolic networks of 43 organisms reveals
that the number of chemical reactions (links) in which a substrate (node) is involved shows
a power-law distribution, with the exponent varying between 2.0 and 2.4.

To capture the scale-free behavior of real-world networks, Barabási and Albert (BA) proposed
a growing network model based on preferential attachment of nodes [9]. In the BA model each
new node is connected to some old nodes with a probability linearly proportional to the
degree of the node, $u(k) \propto (k + \beta)$. This model gives rise to a scale-free
network whose degree distribution follows a power law $p(k) \propto k^{-\gamma}$ with
$\gamma = 3 + \beta$. Since then, several variations of the BA algorithm have been proposed.
An algorithm suggested by Dorogovtsev and Mendes, based on the aging of nodes, also gives
rise to scale-free behavior [10]. Krapivsky et al. also provided an analytical solution for
a different attachment function, $u(k) \sim k^{\lambda}$ [11].

The BA algorithm concentrates only on the degree distribution. Watts and Strogatz [12]
proposed a model which captures the small diameter and large clustering shown by real-world
networks. The clustering coefficient essentially measures the number of triangles, i.e.,
complete subgraphs or cliques of order 3, in the network. Apart from cliques of size 3,
real-world networks exhibit modular structures of higher order [13]. For example, in the
protein binding network of yeast [14], cliques of up to 14 nodes are present in numbers
much higher than in a 'random' network [15]. These small subgraphs are often considered to
be the building blocks of a network. The density of a particular subgraph may tell whether
a network belongs to a certain superfamily [16] or performs specific functions [17]. With
these insights into real-world networks, and in order to capture these properties,
particularly the degree distribution and the statistics of modules or cliques, various
other models [18, 19] and evolution rules have been proposed [20]. In particular, Rozenfeld
and ben-Avraham [21] proposed a local strategy for constructing scale-free networks, with
external parameters capturing the statistical properties of certain modular structures
along with the degree distribution.

In this paper we introduce stochasticity into growing network models. Starting from a few
initially connected nodes, a network in our model evolves as follows. At each time step, n
new nodes join the network and each makes m connections with existing nodes. Both m and n
are taken as stochastic variables. Each new connection is made with a probability which
depends on the degree of the node to be connected; this dependence need not be
preferential. A special case of our model, with linear connection probability and n = 1,
corresponds to the BA algorithm. Note that our evolution rule, being stochastic, naturally
captures various stochastic effects which are always present during the evolution of any
real system.

First, we show an explicit mapping between our model and the zero range process (ZRP), an
exactly solvable model in non-equilibrium physics [22], which provides an exact relation
between any attachment rule u(k) and the degree distribution p(k) of the growing network.
So far there have been several attempts to solve the Barabási-Albert model, where u(k) is
linear in k, that of Dorogovtsev et al. being the closest [23]. These authors also carried
out analytical calculations for certain other forms of preferential attachment [24].
Krapivsky et al. [11] have given an analytical solution for $u(k) \sim k^{\lambda}$. Here,
we provide the exact degree distribution for an arbitrary evolution rule u(k). This
relation, being exact, can be inverted to infer a possible evolution rule for any given
real-world network. Second, we show that the choice of the stochastic parameters does not
alter the degree distribution of the network; it only affects the correlations or the
statistical properties of the modules. Lastly, we apply our methodology to a real-world
network and derive a stochastic evolution rule which captures the exact degree
distribution. We argue that this method can be used to generate a growing network with any
desired degree distribution.

First, the model. A generic algorithm for a growing network is as follows. Starting from a
small connected network, say two nodes connected by a link, one brings n new nodes at each
iteration time t, and each of these n nodes connects to m(i), i = 1, ..., n, existing
nodes. In general, n and m are stochastically varying positive integers drawn from the
distributions $\eta(n)$ and h(m), respectively. These variations are not merely
generalizations of [9]; it is quite natural that in realistic networks a variable number of
nodes joins at any given time and that the number of connections varies from one node to
another. The probability that a given new node i makes a link with one of the existing
nodes j is w(k(j), t), where k(j) is the degree of j and $\sum_j w(k(j), t) = 1$.
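
The growth rule described above can be sketched in a few lines of Python. This is only a minimal illustration, not the authors' implementation: the function name `grow_network`, the callbacks `draw_n` and `draw_m`, and the choices to track degrees only (no edge list), to allow repeated links to the same node, and to freeze the attachment weights within one time step are all assumptions made for brevity.

```python
import random


def grow_network(steps, u, draw_n, draw_m, seed=0):
    """Grow a network and return the list of node degrees.

    u(k)        -- attachment weight of an existing node with degree k (must be > 0)
    draw_n(rng) -- number of new nodes in one time step (a sample from eta(n))
    draw_m(rng) -- number of links made by one new node (a sample from h(m))
    """
    rng = random.Random(seed)
    degree = [1, 1]  # initial condition: two nodes joined by one link
    for _ in range(steps):
        n_new = draw_n(rng)
        old = len(degree)
        # w(k, t) = u(k) / v(t); rng.choices normalizes the weights itself
        weights = [u(k) for k in degree]
        arrivals = []
        for _ in range(n_new):
            m = draw_m(rng)
            for j in rng.choices(range(old), weights=weights, k=m):
                degree[j] += 1  # an existing node gains one link
            arrivals.append(m)  # the new node starts with m links
        degree.extend(arrivals)
    return degree


if __name__ == "__main__":
    # BA-like special case: one node per step, u(k) = k + 0.5, m = 1 or 2
    deg = grow_network(
        steps=500,
        u=lambda k: k + 0.5,
        draw_n=lambda rng: 1,
        draw_m=lambda rng: rng.choice([1, 2]),
    )
    print(len(deg), sum(deg) / len(deg))
```

Because every link adds one to the degree of an old node and one to the degree of the new node, the sum of all degrees is always twice the number of links, which gives a quick sanity check on any run.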

Now, let us find the steady-state degree distribution p(k) of these generic networks as
$t \to \infty$. Let M(k,t) be the number of nodes having k links at time t. Since the
attachment probabilities are normalized, $\sum_k w(k,t) M(k,t) = 1$, we may take
w(k,t) = u(k)/v(t), where

$$ v(t) = \sum_k u(k) M(k,t). \tag{1} $$

Here u(k) is a generic function; it need not be an increasing function of k, which is the
case corresponding to preferential attachment [9, 11]. The rate of increase of M(k,t) is
then given by



$$ \frac{dM(k,t)}{dt} = \frac{\bar{m}\bar{n}}{v(t)} \left[ u(k-1) M(k-1,t) - u(k) M(k,t) \right] + \bar{n} h(k), \tag{2} $$



where $\bar{n} = \sum_n n\,\eta(n)$ is the average number of nodes which join the network
in each iteration step t, and $\bar{m} = \sum_m m\,h(m)$ is the average number of links
made by a new node. Equation (2) is constrained by M(0,t) = 0, which ensures that every
node in the network has a nonzero number of links. The initial condition is
$M(k,0) = 2\delta_{k,1}$; that is, we start with two nodes which are connected. Of course,
(2) must be supplemented by the equation for the growth rate of the number of nodes,



$$ \frac{dN(t)}{dt} = \bar{n}. \tag{3} $$



In general, $\bar{n}$ may explicitly depend on t if $\eta$ explicitly depends on t. We will
consider this case later in this article. The degree distribution p(k) in the steady state
is defined as

$$ p(k) = \lim_{t \to \infty} \left\langle \frac{M(k,t)}{N(t)} \right\rangle, \tag{4} $$

where the averaging $\langle \dots \rangle$ is done over realizations. Clearly, the steady
state is reached only if $M(k,t) \propto N(t)$ for large t. Thus in the steady state we
have

$$ M(k,t) = p(k) N(t). \tag{5} $$

Here, we make the ansatz that the product form (5) holds even for large but finite t. We
will provide evidence in favor of this ansatz later in this article.
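Using Eq. (5) one can rewrite Eq. (2) as

$$ \frac{1}{\bar{m}} \frac{v(t)}{N(t)} = \frac{u(k-1) p(k-1) - u(k) p(k)}{p(k) - h(k)}. \tag{6} $$

The left-hand side depends only on t and the right-hand side only on k, so both sides must
equal a constant, say $\alpha$, and we have

$$ p(k) = \frac{u(k-1)}{\alpha + u(k)}\, p(k-1) + \frac{\alpha\, h(k)}{\alpha + u(k)}, \tag{7} $$

$$ \alpha = \frac{1}{\bar{m}} \frac{v(t)}{N(t)} = \frac{1}{\bar{m}} \sum_k u(k) p(k). \tag{8} $$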



There are a few things to note here. First, $\bar{n}$ does not appear in these equations.
Thus one may fix it to any arbitrary value without changing the degree distribution. We
argue and show later in this article that such parameters, irrelevant with respect to the
degree distribution, may marginally affect the correlations in the network. Second, p(k)
is in fact normalized, which can be proved by summing Eq. (7) over all k.
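
As an illustration of the second remark, the recursion (7) can be iterated numerically for a fixed $\alpha$ and the partial sums of p(k) inspected. The sketch below is not from the original text: the helper name `degree_distribution`, the dictionary representation of h(m), and the particular values of $\alpha$, u(k) and h(m) are assumptions chosen for the example. Summing (7) from 1 to K with p(0) = 0 gives $\sum_{k \le K} p(k) = 1 - u(K) p(K)/\alpha$ whenever h(m) vanishes beyond K, so the sum approaches 1 as long as u(K)p(K) tends to zero.

```python
def degree_distribution(u, h, alpha, kmax):
    """Iterate Eq. (7) with p(0) = 0 and return [p(0), ..., p(kmax)].

    u     -- attachment function u(k), called only for k >= 1
    h     -- dict {m: probability that a new node makes m links}
    alpha -- the constant alpha, treated here as a given input
    """
    p = [0.0] * (kmax + 1)
    for k in range(1, kmax + 1):
        inflow = u(k - 1) * p[k - 1] if k > 1 else 0.0  # p(0) = 0
        p[k] = (inflow + alpha * h.get(k, 0.0)) / (alpha + u(k))
    return p


if __name__ == "__main__":
    u = lambda k: k + 0.5   # illustrative linear rule (assumption)
    h = {1: 0.5, 2: 0.5}    # illustrative link-number distribution (assumption)
    alpha, kmax = 2.0, 5000
    p = degree_distribution(u, h, alpha, kmax)
    # The partial sum falls short of 1 by u(K) p(K) / alpha
    print(sum(p), 1.0 - u(kmax) * p[kmax] / alpha)
```

Note that the normalization holds for any positive $\alpha$; the value of $\alpha$ itself is fixed separately by Eq. (8).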

The solution of the difference equation (7) with the natural boundary condition p(0) = 0
can be written in the compact form

$$ p(k) = \frac{\alpha}{u(k)} \sum_{m=1}^{k} h(m) \prod_{j=m}^{k} \frac{u(j)}{\alpha + u(j)}. \tag{9} $$

However, the main difficulty remains in finding $\alpha$, which has to be determined
self-consistently using Eqs. (8) and (9).

First, let us consider the well-studied case where at each time step only one node with
$m_0$ links joins the network. Then n = 1 and $h(m) = \delta_{m,m_0}$. Thus only the single
term $m = m_0$ survives under the sum in Eq. (9), and we have p(k) = 0 for $k < m_0$. For
$k \ge m_0$,

$$ p(k) = \frac{\alpha}{u(k)} \prod_{j=m_0}^{k} \frac{u(j)}{\alpha + u(j)}. \tag{10} $$



If we use the BA algorithm with the preferential attachment rule $u(k) = k + \beta$, the
degree distribution becomes

$$ p(k) = \alpha\, \frac{\Gamma(\alpha + \beta + m_0)\, \Gamma(\beta + k)}{\Gamma(1 + \alpha + \beta + k)\, \Gamma(\beta + m_0)}, \tag{11} $$

which can be used further to obtain $\alpha = 2 + \beta/m_0$ from (8). Clearly, for large
values of k, $p(k) \sim k^{-1-\alpha}$. Thus the linear attachment rule $u(k) = k + \beta$
generates a scale-free network with $\gamma = 3 + \beta/m_0$. In the original formulation
of Barabási and Albert [9], $\beta$ was taken to be zero and thus $\gamma = 3$.
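
The closed form (11) can be checked numerically against the product form (10) and against the self-consistency condition (8). The following is a hedged sketch rather than part of the original derivation: the function names and the test values of $m_0$ and $\beta$ are assumptions, and the Gamma functions are evaluated through `math.lgamma`, which requires $\beta + m_0 > 0$.

```python
import math


def ba_closed_form(k, m0, beta):
    """Eq. (11) with alpha = 2 + beta/m0, evaluated through log-Gamma."""
    if k < m0:
        return 0.0
    alpha = 2.0 + beta / m0
    log_p = (math.log(alpha)
             + math.lgamma(alpha + beta + m0) + math.lgamma(beta + k)
             - math.lgamma(1.0 + alpha + beta + k) - math.lgamma(beta + m0))
    return math.exp(log_p)


def ba_product_form(k, m0, beta):
    """Eq. (10) for u(k) = k + beta, evaluated as an explicit product."""
    if k < m0:
        return 0.0
    alpha = 2.0 + beta / m0
    prod = 1.0
    for j in range(m0, k + 1):
        prod *= (j + beta) / (alpha + j + beta)
    return alpha * prod / (k + beta)


if __name__ == "__main__":
    m0, beta = 2, 0.5
    alpha = 2.0 + beta / m0
    for k in (2, 5, 50):
        print(k, ba_closed_form(k, m0, beta), ba_product_form(k, m0, beta))
    # Eq. (8): (1/m0) * sum_k u(k) p(k) should approach alpha for a large cutoff
    s = sum((k + beta) * ba_closed_form(k, m0, beta) for k in range(m0, 100001)) / m0
    print(s, alpha)
    # Local log-log slope of the tail, expected to be close to -(1 + alpha)
    slope = math.log(ba_closed_form(2000, m0, beta) / ba_closed_form(1000, m0, beta)) / math.log(2.0)
    print(slope, -(1.0 + alpha))
```

For $\beta = 0$ and $m_0 = 2$ the formula reduces to p(k) = 12/[k(k+1)(k+2)], which makes a convenient hand check.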

In the following we discuss the mapping of our growing network model to the ZRP. Eq. (10)
gives an explicit connection between the two. In the ZRP, particles hop between the sites
of a lattice with rate w(k), where k is the occupancy of the departure site. The
steady-state distribution of particles $\pi(k)$ in this model can be calculated exactly as
$\pi(k) = \frac{1}{\mathcal{N}} \prod_{i=1}^{k} w(i)^{-1}$, where $\mathcal{N}$ is a
normalization constant. From (10) one can identify $\pi(k) = p(k) u(k)$, and then (8)
becomes a normalization condition for $\pi(k)$. The corresponding rate is then

$$ w(k) = \begin{cases} 1 + \dfrac{\alpha}{u(k)} & \text{for } k \ge m_0, \\ 1 & \text{for } k < m_0. \end{cases} \tag{12} $$



FIG. 1: Degree distribution of the PPI network of Saccharomyces cerevisiae [25]. The
evolution rule u(k) derived using (15) is shown in the inset. The solid line in the inset
is a linear fit u(k) = k - 0.8, for which one expects $p(k) \sim k^{-2.2}$. A solid line
with slope -2.2 is drawn in the main figure to compare p(k) with the theory.



Now the asymptotic behavior of $\pi(k)$, and thus of p(k), may be obtained from the known
results for the ZRP [22]. To explain the importance of this mapping, let us take the
example $u(k) = k^{\lambda}$ considered in [11]. There are three different possibilities.
For $0 < \lambda < 1$, $\pi(k)$ is a stretched exponential and thus
$p(k) \sim k^{-\lambda} \exp(-\alpha k^{1-\lambda}/(1-\lambda))$. For $\lambda = 1$ one
gets $p(k) \sim k^{-(\alpha+1)}$. Finally, for $\lambda > 1$, $\pi(k)$ asymptotically
reaches a constant and thus the distribution is $p(k) \sim k^{-\lambda}$.
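
These asymptotic forms can be illustrated by building the ZRP weight directly from the hop rate $w(k) = 1 + \alpha/u(k)$ and measuring the local log-log slope of $p(k) \propto \pi(k)/u(k)$. The snippet is a sketch under stated assumptions: the function names are hypothetical, $\alpha$ is simply set to 2 rather than determined self-consistently, $m_0 = 1$, and the slope is estimated between two arbitrary degrees.

```python
import math


def zrp_weight(u, alpha, kmax, m0=1):
    """pi(k) up to normalization: product over i <= k of 1/w(i), with w(i) = 1 + alpha/u(i)."""
    pi = [1.0] * (kmax + 1)
    for k in range(1, kmax + 1):
        w = 1.0 + alpha / u(k) if k >= m0 else 1.0
        pi[k] = pi[k - 1] / w
    return pi


def tail_slope(u, alpha, k1=1000, k2=2000):
    """Local log-log slope of p(k) = pi(k) / u(k) between k1 and k2."""
    pi = zrp_weight(u, alpha, k2)
    p1, p2 = pi[k1] / u(k1), pi[k2] / u(k2)
    return math.log(p2 / p1) / math.log(k2 / k1)


if __name__ == "__main__":
    alpha = 2.0  # treated as given here (assumption)
    print(tail_slope(lambda k: float(k), alpha))       # lambda = 1: close to -(alpha + 1)
    print(tail_slope(lambda k: float(k) ** 2, alpha))  # lambda = 2: close to -2
    print(tail_slope(lambda k: float(k) ** 0.5, alpha))  # lambda = 1/2: no fixed slope
```

For $\lambda < 1$ the measured slope keeps steepening as the two degrees are increased, which is the signature of the stretched exponential rather than a power law.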

One can also obtain the asymptotic behavior of p(k) by taking the continuum limit x = k/K,
where K is the maximum possible number of links (an arbitrarily large number). The
difference equation (7) becomes the differential equation

$$ -\frac{1}{p(x)} \frac{d}{dx} \left[ p(x) u(x) \right] = \alpha, $$

with the boundary condition $p(x_0) = \alpha/[\alpha + u(x_0)]$, where $x_0 = m_0/K$. A
formal solution is then

$$ p(x) = \frac{\alpha}{\alpha + u(x_0)}\, \frac{u(x_0)}{u(x)} \exp\left( -\alpha \int_{x_0}^{x} \frac{dx'}{u(x')} \right), \tag{13} $$

$$ \alpha = \frac{1}{x_0} \int dx\, u(x) p(x). \tag{14} $$

It is easy to check that the above equations provide the correct asymptotic behavior for
the exactly solvable cases $u(k) = k + \beta$ and $u(k) = k^{\lambda}$.
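
The check mentioned above can be written out explicitly. The following is a worked sketch, assuming the continuum equation in the form $-\frac{1}{p}\frac{d}{dx}(pu) = \alpha$, so that $p(x)u(x) \propto \exp(-\alpha \int^{x} dx'/u(x'))$; prefactors fixed by the boundary condition are omitted.

```latex
% Linear rule u(x) = x + \beta
\int_{x_0}^{x} \frac{dx'}{x' + \beta} = \ln\frac{x + \beta}{x_0 + \beta}
\quad\Longrightarrow\quad
p(x) \propto \frac{1}{x + \beta} \left( \frac{x + \beta}{x_0 + \beta} \right)^{-\alpha}
\sim x^{-(1 + \alpha)} .

% Power-law rule u(x) = x^{\lambda} with 0 < \lambda < 1
\int_{x_0}^{x} \frac{dx'}{x'^{\lambda}} = \frac{x^{1-\lambda} - x_0^{1-\lambda}}{1 - \lambda}
\quad\Longrightarrow\quad
p(x) \propto x^{-\lambda} \exp\!\left( -\frac{\alpha\, x^{1-\lambda}}{1 - \lambda} \right) .

% Power-law rule with \lambda > 1: the integral converges as x grows,
% so the exponential tends to a constant and
p(x) \sim x^{-\lambda} .
```

All three results agree with the asymptotic forms obtained from the ZRP mapping.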

Let us emphasize at this point that, although writing a closed-form expression for p(k)
for a generic u(k) is difficult, the asymptotic behavior can be obtained easily using
either the mapping to the ZRP or the continuum limit, Eqs. (13) and (14). As far as the
exact derivation of p(k) is concerned, one may numerically implement (9) and (8), i.e.,
iterate (9) and (8) starting from an assumed initial $\alpha$. In most cases, we observe
that $\alpha$ converges rapidly (within 15 iterations) to a constant.
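
A numerical determination of $\alpha$ might look as follows. This is a minimal sketch and not the iteration scheme described in the text: instead of repeatedly substituting $\alpha$ into (9) and (8), it locates the root of the mismatch in Eq. (8) by bisection, which is used here as a simple and robust substitute. The function names, the cutoff `kmax`, and the bracket `[lo, hi]` are assumptions.

```python
def degree_distribution(u, h, alpha, kmax):
    """Iterate Eq. (7) with p(0) = 0; returns [p(0), ..., p(kmax)]."""
    p = [0.0] * (kmax + 1)
    for k in range(1, kmax + 1):
        inflow = u(k - 1) * p[k - 1] if k > 1 else 0.0
        p[k] = (inflow + alpha * h.get(k, 0.0)) / (alpha + u(k))
    return p


def solve_alpha(u, h, kmax=20000, lo=0.1, hi=100.0, tol=1e-9):
    """Find alpha with (1/m_bar) * sum_k u(k) p(k) = alpha, Eq. (8), by bisection.

    Assumes the mismatch is positive at `lo` and negative at `hi`.
    """
    m_bar = sum(m * w for m, w in h.items())

    def mismatch(alpha):
        p = degree_distribution(u, h, alpha, kmax)
        return sum(u(k) * p[k] for k in range(1, kmax + 1)) / m_bar - alpha

    if not (mismatch(lo) > 0.0 > mismatch(hi)):
        raise ValueError("bracket [lo, hi] does not enclose a solution")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if mismatch(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    # u(k) = k + 0.5 with one link per new node: expect alpha close to 2 + beta/m0 = 2.5
    print(solve_alpha(lambda k: k + 0.5, {1: 1.0}))
```

The finite cutoff `kmax` truncates the sum in (8), so the result carries a small truncation error that shrinks as the cutoff is raised.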

It is important to note that Eq. (7) can be inverted, by summing it from 1 to k with
p(0) = 0, to get

$$ u(k) = \frac{\alpha}{p(k)} \sum_{i=1}^{k} \left[ h(i) - p(i) \right]. \tag{15} $$

Here, $\alpha$ appears as a multiplicative constant, which can be dropped as it is
irrelevant for the evaluation of p(k). Eq. (15) provides insight into a possible evolution
rule for any real-world network. As an example we take the protein-protein interaction
(PPI) network of Saccharomyces cerevisiae (yeast) [25]. The largest connected part has
N = 3930 nodes and M = 7725 links. The degree distribution of this network is shown in
Fig. 1. The average degree of this network is 3.93, which may be modeled using
$h(m) = 0.4\,\delta_{m,1} + 0.234\,\delta_{m,2} + 0.366\,\delta_{m,3}$. We evaluate u(k)
for this network using (15) (shown in the inset of Fig. 1); it fits well with the linear
function u(k) = 1.5(k - 0.8). Note that for this fit we ignore large k values, since for
these p(k) is very small and sometimes zero. The corresponding degree distribution is then
expected to be scale-free, $p(k) \sim k^{-2.2}$, which is consistent with the observed
distribution.
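
The inversion (15) is easy to try on synthetic data. The sketch below is illustrative only and does not use the PPI data: it generates p(k) from a known rule with the recursion (7) and then recovers the rule from p(k) and h(m). The function names are hypothetical, h(m) is the distribution quoted above, and the rule u(k) and the value of $\alpha$ are arbitrary test inputs.

```python
def degree_distribution(u, h, alpha, kmax):
    """Iterate Eq. (7) with p(0) = 0; returns [p(0), ..., p(kmax)]."""
    p = [0.0] * (kmax + 1)
    for k in range(1, kmax + 1):
        inflow = u(k - 1) * p[k - 1] if k > 1 else 0.0
        p[k] = (inflow + alpha * h.get(k, 0.0)) / (alpha + u(k))
    return p


def infer_rule(p, h, alpha=1.0):
    """Eq. (15): u(k) = (alpha / p(k)) * sum_{i<=k} [h(i) - p(i)].

    p -- list with p[0] = 0 and p[k] the fraction of nodes with degree k
    h -- dict {m: probability that a new node makes m links}
    Returns {k: u(k)} for every k with p(k) > 0.
    """
    u_est, running = {}, 0.0
    for k in range(1, len(p)):
        running += h.get(k, 0.0) - p[k]
        if p[k] > 0.0:
            u_est[k] = alpha * running / p[k]
    return u_est


if __name__ == "__main__":
    h = {1: 0.4, 2: 0.234, 3: 0.366}
    true_u = lambda k: 1.5 * (k - 0.8)  # arbitrary test rule
    alpha = 2.0                         # arbitrary test value
    p = degree_distribution(true_u, h, alpha, kmax=60)
    u_est = infer_rule(p, h, alpha)
    for k in (1, 5, 20):
        print(k, true_u(k), u_est[k])
```

Calling `infer_rule` with the default `alpha=1.0` returns the same rule divided by the true $\alpha$, which is the sense in which $\alpha$ is only a multiplicative constant. With empirical data, degrees for which p(k) = 0 are skipped, and the estimate becomes noisy at large k where p(k) is small.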

Now, we turn our attention to the other stochastic parameter, $\eta(n)$, namely the
distribution of the number of nodes which join the network during each iteration step t.
We have seen in (9) that $\eta(n)$ does not alter the degree distribution. However, it
marginally affects the correlations or the statistical properties of modular structures in
the network. To illustrate this point, we generate a network with u(k) = k + 0.5,
$h(m) = \delta_{m,4}$ and $\eta(n) = q\,\delta_{n,1} + (1-q)\,\delta_{n,8}$ and measure
the clustering coefficient for different q. As shown in Fig. 2, we find that the
clustering coefficient changes only marginally with q.

FIG. 2: Variation of the clustering coefficient with the stochastic parameter q (see
text). Other parameters are u(k) = k + 0.5, N = 1000, m = 4 and
$\eta(n) = q\,\delta_{n,1} + (1-q)\,\delta_{n,8}$; the clustering coefficient is averaged
over 1000 realizations.

FIG. 3: Log-scale plot of v(t) for two different cases: (a)
$\eta(n) = 0.6\,\delta_{1,n} + 0.4\,\delta_{2,n}$ and (b) $n(t) = \sqrt{t}$. It is expected
from (3) and (8) that v(t) is linear in the first case, whereas for (b)
$v(t) \sim t^{3/2}$. Solid lines with slopes 1 and 1.5 are drawn for comparison. For both
cases, u(k) = k - 0.5 and m = 1, and averaging is done over 1000 realizations. The degree
distribution $p(k) \sim k^{-2.5}$ (inset) is identical for both cases.

Our analysis here relies on the fact that Eq. (5) holds for large networks (as
$t \to \infty$). Let us check the validity of (5) in detail. From (8) it is clear that
v(t) is proportional to N(t), which can be obtained from (3). First, we numerically
evaluate v(t) for a few different networks and compare them with the theoretical result
(3). If the number of new nodes n is a stochastic variable drawn from a time-independent
distribution, then $N(t) = \bar{n} t + 2$ is linear. However, one can introduce an
explicit time dependence in n to get a non-linear N(t). For example, if
$n(t) = \sqrt{t}$ we have $N(t) = \frac{2}{3} t^{3/2} + 2$ and thus
$v(t) \propto t^{3/2}$. In Fig. 3 we plot the numerically measured v(t) in log scale for
two different cases, (a) $\eta(n) = 0.6\,\delta_{n,1} + 0.4\,\delta_{n,2}$ and (b)
$n(t) = \sqrt{t}$; both agree well with (3). Although N(t) is quite different in the two
cases, p(k) (shown in the inset) was found to be the same, as expected. For both cases the
evolution rule is u(k) = k - 0.5 and thus we have $p(k) \sim k^{-2.5}$. To conclude,
Eq. (8) holds quite well after as few as $t \sim 10$ iterations. For large networks, the
number of nodes which join in the first few iteration steps is vanishingly small compared
to the size of the network, and hence does not affect the network properties.
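
The two growth laws for N(t) follow from accumulating Eq. (3) step by step. The short sketch below tracks only the expected network size, not a stochastic realization or v(t) itself; the function names and the time window used for the slope estimate are assumptions.

```python
import math


def expected_size(n_bar, t_max):
    """Accumulate dN/dt = n_bar(t) in unit time steps, starting from N(0) = 2."""
    sizes = [2.0]
    for t in range(1, t_max + 1):
        sizes.append(sizes[-1] + n_bar(t))
    return sizes


def loglog_slope(sizes, t1, t2):
    """Slope of log N(t) against log t between times t1 and t2."""
    return math.log(sizes[t2] / sizes[t1]) / math.log(t2 / t1)


if __name__ == "__main__":
    const = expected_size(lambda t: 0.6 * 1 + 0.4 * 2, 10000)  # case (a): n_bar = 1.4
    root = expected_size(lambda t: math.sqrt(t), 10000)        # case (b): n(t) = sqrt(t)
    print(loglog_slope(const, 1000, 10000))  # close to 1
    print(loglog_slope(root, 1000, 10000))   # close to 1.5
```

Since v(t) is proportional to N(t) by Eq. (8), the same slopes are expected for v(t) in a log-log plot.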

In summary, we have introduced a generic model of stochastically growing networks and
shown that this model can easily be mapped to the ZRP, which enables us to derive an exact
relation between the degree distribution of the network and its evolution function. This
relation can be used to derive the analytical form of the degree distribution for an
arbitrary evolution rule and, conversely, to infer a possible evolution rule from given
network data. The inferred evolution rule reproduces the degree distribution obtained from
the given network data exactly, even for small k values. We demonstrate this by taking the
example of a real-world PPI network and deriving a possible evolution rule for it.

Based on our exact calculations we expect to gain a better understanding of the evolution
of real-world networks. Also, since the ZRP is exactly solvable, the mapping between the
ZRP and network growth models opens up a platform to study the interplay between evolution
rules and the steady-state degree distribution.

One of us (PKM) acknowledges MPIPKS for the hos- 